Descoberta Automática de Relações Não-Taxonômicas a Partir de Corpus em Língua Portuguesa (Automatic Discovery of Non-Taxonomic Relations from a Portuguese-Language Corpus)
Abstract
Ontology construction is a complex process composed of extraction tasks for domain concepts and for the taxonomic and non-taxonomic relations among them. The extraction of non-taxonomic relations is the most neglected of these tasks, especially for Portuguese texts. This paper therefore presents a proposal for extracting non-taxonomic relations from Portuguese texts represented by a list of concepts and their contextual information, extracted automatically by the ExATOlp software tool.
Similar resources
Abordagens para Estimar Relevância de Relações Não-Taxonômicas Extraídas de Corpus de Domínio (Approaches to Estimating the Relevance of Non-Taxonomic Relations Extracted from Domain Corpora)
This paper compares two approaches to weighting the relevance of non-taxonomic relations extracted from domain corpora. The first approach computes relevance according to the absolute frequency of the verb. The second approach computes relevance according to the verb's frequency and uniqueness in each corpus using the tf-dcf relevance index, an index that takes into account the...
Extração de Contextos Definitórios a partir de Textos em Língua Portuguesa (Extraction of Defining Contexts from Texts in Portuguese) [in Portuguese]
A defining context is a part of a text or a statement that provides information about a concept, based on its use. Extracting defining contexts from texts is an important task in many applications, such as aiding the construction of ontologies, developing translation aids, and creating glossaries and dictionaries, among others. Thus, this paper proposes, implements and evaluates a s...
RePort - Um Sistema de Extração de Informações Aberta para Língua Portuguesa (Report - An Open Information Extraction System for Portuguese Language)
An emerging field of research in Natural Language Processing (NLP) proposes Open Information Extraction (Open IE) systems. Open IE systems follow a domain-independent extraction paradigm that uses generic patterns to extract all relationships between entities. In this work, we present RePort, an Open IE method for Portuguese based on ReVerb, an approach for English. Adaptations of syntactic and...
Extracção de relações semânticas entre palavras a partir de um dicionário: o PAPEL e a sua avaliação (Extraction of semantic relations between words from a dictionary: PAPEL and its evaluation)
In this article we present PAPEL, a lexical resource for Portuguese consisting of relations between words, extracted automatically from a general-language dictionary by means of grammars written manually for that purpose. After putting the type of resource and the choices made into context, we give an overview of its construction process, presenting the relations included and their q...
Geração de features para resolução de correferência: Pessoa, Local e Organização (Feature Generation for Coreference Resolution: Person, Location and Organization) [in Portuguese]
This work aims at resolving coreference in Portuguese, focusing on the named-entity categories Person, Location and Organization. The proposed method uses supervised learning. To this end, the use of features that assist in the correct classification of named entities is critical. The construction and refinement of these features are of great relevance to this task. The performance of many othe...